DEX - Documentation Extraction Utility
Fred Fish
345 Scottsdale Road
Pleasant Hill, Ca 94523
ABSTRACT
Dex is a utility for extracting documentation from
program source files and compiling it into a form
suitable for input to a text formatting program such as
the UNIX "nroff" utility.
The primary benefits of a utility like dex are:
(1) internal source code documentation is far more
likely to remain current than external documentation,
(2) the production of updated documentation can be done
largely under computer control.
Dex was developed by the author to aid in maintaining
documentation for his portable math library (pml).
Thus it is particularly useful in similar applications
where a programming project's source code is split into
many small files.
June 28, 1983
1. INTRODUCTION
Dex is a utility for extracting documentation from program
source files and compiling it into a form suitable for input to a
text formatting program. Currently, the only text formatting
program supported is the UNIX^ "nroff" utility. However, it
would be a relatively simple process to extend the current
implementation to support other text processors.
Dex was originally developed to aid the author in maintaining
his portable math library (pml) documentation. However, dex
is useful for virtually any application where a medium to large
program is under development, particularly if the work is being
done simultaneously by several programmers.
Much of the dex implementation is done using lex and yacc.
This has greatly reduced the amount of programming effort
required and made constant evolution relatively painless. It
does, however, effectively restrict usage of dex to UNIX systems
since few non-UNIX systems support these development aids.
__________________________
^UNIX is a trademark of Bell Laboratories.
2. SOURCE CODE EXAMPLE
2.1. C Source Code Sample
The source code format for dex usage is quite similar to the
documentation format found in the Unix Programmer's Manual. This
leads to highly readable internal documentation in a familiar
format. For example, the documentation shown in figure 1 might
precede the source code for a "tolower" library function.
/*
 *  FUNCTION
 *
 *      tolower   convert character to lower case
 *
 *  SYNOPSIS
 *
 *      char tolower(ch)
 *      char ch;
 *
 *  DESCRIPTION
 *
 *      Tolower converts a character from upper case
 *      to lower case. Characters which are already
 *      lower case or are not alphabetic are returned
 *      unmodified.
 *
 *              a => a
 *              A => a
 *              & => &
 *              7 => 7
 *              .
 *              etc
 *
 */
Figure 1
Source Code Example
2.2. Dex Sample Output
Dex would process the text shown in figure 1, a documentation
region, to produce the text formatter source in figure 2.
However, before you try it, at least read the section on dynamic
reconfiguration, since dex will do nothing without the
appropriate reconfiguration file.
.in 0
.fi
.bp
.sp 3
.ce
***********
.ce
* tolower *
.ce
***********
.sp 3
.ul 1
SYNOPSIS
.sp 1
.in 8
.nf
char tolower(ch)
char ch;
.sp 1
.in 0
.ul 1
DESCRIPTION
.sp 1
.in 8
.fi
Tolower converts a character from upper case
to lower case. Characters which are already
lower case or are not alphabetic are returned
unmodified.
.sp 1
.nf
.in 16
a => a
A => a
& => &
7 => 7
.
etc
.sp 1
Figure 2
Dex Sample Output
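The boxed name at the top of figure 2 follows a regular pattern: each of the three lines is preceded by a ".ce" request, and the border is four characters wider than the word it encloses. The following is a minimal sketch of that pattern only. The function name and signature are hypothetical and are not taken from the dex sources.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/* Write nroff input that centers `word` inside a box of asterisks,
 * using the same layout as the box in figure 2.
 * Returns 0 on success, -1 if the word or the output buffer does not fit.
 * Hypothetical helper; dex's own routine may differ. */
int format_box(const char *word, char *out, size_t outsz)
{
    char border[128];
    size_t width;
    int n;

    assert(word != NULL && out != NULL);

    width = strlen(word) + 4;            /* "* " + word + " *" */
    if (width >= sizeof border)
        return -1;
    memset(border, '*', width);
    border[width] = '\0';

    n = snprintf(out, outsz, ".ce\n%s\n.ce\n* %s *\n.ce\n%s\n",
                 border, word, border);
    return (n < 0 || (size_t)n >= outsz) ? -1 : 0;
}
```

Called with the word "tolower", the function fills the buffer with the six box lines of figure 2, from the first ".ce" through the closing row of asterisks.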
2.3. Nroff Sample Output
Dex currently knows nothing about macro packages, so its
output can be intermixed with other nroff source intended for any
of the macro packages (subject to the restrictions noted in the
macro package's documentation concerning command conflicts).
Figure 3 shows the output of nroff for the input of figure 2.
***********
* tolower *
***********
_S_Y_N_O_P_S_I_S
char tolower(ch)
char ch;
_D_E_S_C_R_I_P_T_I_O_N
Tolower converts a character from upper case to lower
case. Characters which are already lower case or are not
alphabetic are returned unmodified.
a => a
A => a
& => &
7 => 7
.
etc
Figure 3
Nroff Sample Output
2.4. Dynamic Reconfiguration
Dex has no built-in knowledge about how to handle each
documentation topic (such as "DESCRIPTION") with respect to formatter
modes such as fill, underline, indent, etc. This knowledge is
obtained via a mechanism referred to as dynamic reconfiguration,
whereby a special file is read from the directory containing the
processed source files. Dex can also be instructed to read a
reconfiguration file of the user's choosing.
Dex does not complain if the reconfiguration file is absent or
unreadable; in that case it will essentially do nothing while
processing the specified source files. In the example of
figure 1, dex would typically be reconfigured to default to fill
mode for the text of the "DESCRIPTION" topic and to default to
nofill mode for the text of the "SYNOPSIS" topic.
3. DEFINITIONS
To provide a consistent terminology for describing dex usage
some basic "buzz words" need to be defined.
3.1. Documentation Topic
The basic building block of internal documentation is the
documentation topic. A documentation topic is a contiguous
section of documentation which begins with recognition of a topic
identifier and continues until recognition of another topic
identifier, or program source code (junk).
In the previous example, the topic identifiers are "FUNCTION",
"SYNOPSIS", and "DESCRIPTION"; a "BUGS" identifier, not shown in
figure 1, would introduce a fourth topic. The topics are the
topic identifiers along with the corresponding topic body (text).
Thus a "BUGS" topic would consist of the lines shown in figure 4.
 *  BUGS
 *
 *      As implemented only works on systems for which
 *      native character set is ASCII.
 *
Figure 4
Documentation Topic Example
3.2. Documentation Region
A documentation region is a (possibly non-contiguous)
section of documentation comprised of one or more documentation
topics. It begins with recognition of a topic identifier which
has previously been configured to start a region and continues
until another region start identifier is encountered.
In the example of figure 1, only the topic identifier
"FUNCTION" will generally be flagged as starting a new documentation
region. Thus the entire C comment is a single documentation
region; in this case the region happens to be contiguous, since
no C code is embedded between its topics.
4. SOURCE LINE DETAILS
Dex is set up to recognize comment lines for various source
file types. It recognizes the "*" character as starting a C
comment line, the "#" character as starting "make" and dex
reconfiguration file comment lines, and a ";" character or "|"
character as starting assembler comment lines. These characters
can be preceded by any number of blanks and tabs and must be
followed by at least one blank or tab, or immediately by a
newline as in form (4) below.
Files processed by dex are divided into strings which match one
of the following forms:
(1) <blanks/tabs><comment_string><blanks><text><newline>
    Usually contains the topic identifier in the "text"
    field. The <blanks> field consists of 2 or more
    blanks.
(2) <blanks/tabs><comment_string><tab><text><newline>
    The text is emitted in fill or nofill mode depending upon
    the state of the EMITFILL flag (see reconfiguration
    section).
(3) <blanks/tabs><comment_string><tab><tab><text><newline>
    The text is emitted in nofill mode regardless of the
    state of the EMITFILL flag. The <text> is anything
    except newline, including blanks and tabs.
(4) <blanks/tabs><comment_string><newline>
    Causes commands for a single blank line to be emitted.
    Thus blank comment lines map one for one with blank lines
    in the nroff output.
(5) Any line not beginning with <blanks/tabs> followed by
    <comment_string> is considered to be "junk" and
    terminates processing of the current topic. Subsequent
    lines which match (2), (3), or (4) above will be
    ignored until the next match of (1).
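The five forms above can be sketched as a small classifier. This is an illustration of the rules as stated, not the lex specification dex actually uses; the enum and function names are hypothetical. One case the text does not cover, a comment character followed by a single blank and then text, is treated as junk here purely as an assumption.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Line forms (1)-(5) from the list above; the names are illustrative. */
enum line_form {
    FORM_IDENTIFIER = 1,  /* comment string, 2 or more blanks, text */
    FORM_TEXT       = 2,  /* comment string, one tab, text          */
    FORM_NOFILL     = 3,  /* comment string, two tabs, text         */
    FORM_BLANK      = 4,  /* comment string, then newline           */
    FORM_JUNK       = 5   /* anything else                          */
};

/* Classify one input line (with or without its trailing newline). */
enum line_form classify_line(const char *line)
{
    assert(line != NULL);

    /* Skip the optional leading blanks and tabs. */
    while (*line == ' ' || *line == '\t')
        line++;

    /* The comment string is one of the characters named in the text. */
    if (*line == '\0' || strchr("*#;|", *line) == NULL)
        return FORM_JUNK;
    line++;

    if (*line == '\n' || *line == '\0')
        return FORM_BLANK;
    if (line[0] == '\t' && line[1] == '\t')
        return FORM_NOFILL;
    if (line[0] == '\t')
        return FORM_TEXT;
    if (line[0] == ' ' && line[1] == ' ')
        return FORM_IDENTIFIER;

    /* Assumption: anything else, including "*" followed by "/", is junk. */
    return FORM_JUNK;
}
```

Under these rules the opening "/*" and closing "*/" of a C comment both fall through to the junk case, which agrees with the markings in figure 5.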
Figure 1 is reproduced in figure 5 with each line marked
according to which pattern it matches. Carefully study figures
1, 2, 3, and 5 to determine what transformations are made in
going from the source code to the nroff output.
5    /*
1     *  FUNCTION
4     *
2     *      tolower   convert character to lower case
4     *
1     *  SYNOPSIS
4     *
2     *      char tolower(ch)
2     *      char ch;
4     *
1     *  DESCRIPTION
4     *
2     *      Tolower converts a character from upper case
2     *      to lower case. Characters which are already
2     *      lower case or are not alphabetic are returned
2     *      unmodified.
4     *
3     *              a => a
3     *              A => a
3     *              & => &
3     *              7 => 7
3     *              .
3     *              etc
4     *
5     */
Figure 5
Source Code Example Revisited
Note that there can be any number of leading tabs and blanks
(or none). Also, lines which have exactly one tab after the
string recognized as starting a comment can be emitted in either
filled or unfilled mode while lines having two tabs are emitted
in unfilled mode only. This allows reasonable handling and
appearance of fillable text interspersed with lists or other
non-fillable text.
It should also be pointed out that the file being processed
does not have to be an ASCII text file. All input bytes are
masked with octal 177 so object files and other non-ASCII files
do not cause problems. This allows one to scan all files in a
directory, extracting documentation, without worrying about what
kind of files they are.
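The masking step can be illustrated in a few lines of C. This is a sketch of the idea only; the helper names are hypothetical, and how dex itself reads its input is not described here.

```c
#include <assert.h>
#include <stdio.h>

/* Strip the high bit so every byte falls in the 7-bit range 0..0177. */
int mask_byte(int c)
{
    return c & 0177;
}

/* Read one byte from fp and mask it.
 * Returns EOF unchanged at end of input. */
int masked_getc(FILE *fp)
{
    int c;

    assert(fp != NULL);
    c = getc(fp);
    return (c == EOF) ? EOF : mask_byte(c);
}
```

A scanner that reads through masked_getc() never sees a value above octal 177, whatever kind of file it is given.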
5. RECONFIGURATION
Dex is designed so that it can be reconfigured dynamically
by placing reconfiguration instructions in a file in the same
directory as the files to be processed. If a file name other than
the default reconfiguration file is desired, this too can be
handled by issuing the appropriate command line switch when dex is
invoked.
Each time dex processes the first file in any directory it
automagically looks for a file called ".dexrc", the default dex
reconfiguration file. It will not complain if the file is not
found or is unreadable. Some of the things which can be
reconfigured are:
o Topic identifier words: Specific strings can be
  added to or removed from the internal tables.
o Fill mode: The default handling for fill/unfill
  mode can be enabled or disabled for specific
  documentation topics.
o Output files: The output for specific documentation
  regions can be redirected to files other than the
  default standard output.
o Processing inhibition: Processing of specified
  documentation regions or topics can be disabled or
  enabled dynamically.
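The default lookup described above, a ".dexrc" file in the same directory as the file being processed, can be sketched as follows. The function name is hypothetical, and the sketch assumes UNIX-style path names with "/" as the separator.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/* Build the name of the default reconfiguration file that sits in the
 * same directory as `source`.
 * Returns 0 on success, -1 if `out` is too small.
 * Hypothetical helper; assumes "/" separates path components. */
int dexrc_path(const char *source, char *out, size_t outsz)
{
    const char *slash;
    size_t dirlen;
    int n;

    assert(source != NULL && out != NULL);

    /* Keep everything up to and including the last "/", if any. */
    slash = strrchr(source, '/');
    dirlen = (slash != NULL) ? (size_t)(slash - source) + 1 : 0;

    n = snprintf(out, outsz, "%.*s.dexrc", (int)dirlen, source);
    return (n < 0 || (size_t)n >= outsz) ? -1 : 0;
}
```

For a source file given as "pml/sin.c" this yields "pml/.dexrc"; for a bare "sin.c" it yields ".dexrc" in the current directory.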
5.1. Reconfiguration File Format
Each line of a reconfiguration file is either a comment or a
reconfiguration directive. Comments are any lines beginning with
the "#" character (with or without leading whitespace), a blank,
or a form feed character. The directives currently supported
are:
.flags     Set or reset flags for a given topic identifier.
.output    Redirect output to a file for a given topic identifier.
More than one directive line may be associated with any
given identifier. Thus, for example, if one line is insufficient
to set or reset all desired flags for the "FILE" identifier,
then another directive can be issued for the remaining
flags.
Entries are free format within a given line; whitespace
separating directive fields is simply ignored. The formats of
the current directives are:
.flags <identifier> [-]<flag1> [-]<flag2> ...
.output <identifier> <filename>
5.2. Identifier Flags
Flags associated with specified topic identifiers can be set
or reset by the ".flags" reconfiguration directive. When dex
starts up all flags are initially reset. They are set by simply
naming them in the directive line (a leading "+" character is
optional). They are reset by naming them with a leading "-"
character. Flags not named in the reconfiguration file remain
unchanged. The current flags are:
PROCESS     If set then topic is processed, otherwise
            it is ignored.
EMITTEXT    If set then topic text is emitted, otherwise
            there is no output while topic is processed.
EMITBOX     If set then first word of topic text is
            emitted enclosed in a box. (Doesn't
            mesh well with EMITTEXT)
EMITFILL    If set then text separated from comment
            string ("*", "#", or ";") with a single
            tab is emitted in fill mode.
EMITUL      If set then topic identifier is underlined
            when emitted.
EMITBP      If set then commands to start new page
            are emitted prior to topic output
            (generally used with REGION).
REGION      If set then topic identifier is considered
            to start a new documentation region. If
            PROCESS is simultaneously reset then all
            topics until the next region are suppressed.
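The interaction between REGION and PROCESS can be sketched as a small piece of state that is updated each time a topic identifier is recognized. The type, the bit values and the function name are hypothetical. The sketch also assumes that topics seen before any region has started are governed by their own PROCESS flag alone, which the text does not state.

```c
#include <assert.h>
#include <stddef.h>

/* Only the two flags involved are modelled; bit values are arbitrary. */
enum { FLAG_PROCESS = 1 << 0, FLAG_REGION = 1 << 1 };

/* Remembers whether the current region was started with PROCESS reset. */
struct region_state {
    int suppressed;
};

/* Called when a topic identifier with the given flags is recognized.
 * Returns 1 if the topic should be processed, 0 if it is ignored. */
int topic_is_processed(struct region_state *st, unsigned flags)
{
    assert(st != NULL);

    /* A region start decides the fate of every topic up to the next one. */
    if (flags & FLAG_REGION)
        st->suppressed = !(flags & FLAG_PROCESS);

    return !st->suppressed && (flags & FLAG_PROCESS) != 0;
}
```

Once an identifier carrying REGION without PROCESS is seen, later topics return 0 even when their own PROCESS flag is set, until another region start clears the suppression.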
5.3. Reconfiguration File Example
Figure 6 is a reconfiguration file which specifies that the
identifier "FUNCTION" starts a documentation region, the region
is to be processed, the function name is to be printed in a box
and each function starts on a new page. Also, the identifier
"DESCRIPTION" is for a normal (non-region) topic, the topic is
processed, its text is emitted, the emitted text is in fill mode
by default, and the identifier is to be underlined when printed.
.flags "FUNCTION" REGION PROCESS EMITBOX EMITBP
.flags "DESCRIPTION" -REGION PROCESS EMITTEXT
.flags "DESCRIPTION" EMITUL EMITFILL
Figure 6
Reconfiguration File Example
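The effect of the directive lines in figure 6 can be traced with a small parser for ".flags" lines. This is a sketch under stated assumptions rather than dex's own code: the bit values, table and function name are hypothetical, and the identifier is accepted either quoted, as in figure 6, or unquoted, as the format summary suggests.

```c
#include <assert.h>
#include <string.h>

/* Bit values are arbitrary; only the flag names come from the text. */
enum {
    F_PROCESS = 1 << 0, F_EMITTEXT = 1 << 1, F_EMITBOX = 1 << 2,
    F_EMITFILL = 1 << 3, F_EMITUL = 1 << 4, F_EMITBP = 1 << 5,
    F_REGION = 1 << 6
};

static const struct { const char *name; unsigned bit; } flag_table[] = {
    {"PROCESS", F_PROCESS}, {"EMITTEXT", F_EMITTEXT}, {"EMITBOX", F_EMITBOX},
    {"EMITFILL", F_EMITFILL}, {"EMITUL", F_EMITUL}, {"EMITBP", F_EMITBP},
    {"REGION", F_REGION}
};

/* Apply one ".flags" line to *flags if it names `ident`.
 * Returns 1 if applied, 0 if the line concerns something else,
 * -1 on a malformed line or an unknown flag name. */
int apply_flags_line(const char *line, const char *ident, unsigned *flags)
{
    const size_t nflags = sizeof flag_table / sizeof flag_table[0];
    char buf[256], *p, *name, *tok;
    size_t i;

    assert(line != NULL && ident != NULL && flags != NULL);
    if (strlen(line) >= sizeof buf)
        return -1;
    strcpy(buf, line);

    p = buf + strspn(buf, " \t");
    if (strncmp(p, ".flags", 6) != 0 || (p[6] != ' ' && p[6] != '\t'))
        return 0;
    p += 6;
    p += strspn(p, " \t");

    if (*p == '"') {                     /* quoted identifier */
        name = ++p;
        p = strchr(p, '"');
        if (p == NULL)
            return -1;
    } else {                             /* unquoted identifier */
        name = p;
        p += strcspn(p, " \t\n");
    }
    if (*p != '\0')
        *p++ = '\0';
    if (strcmp(name, ident) != 0)
        return 0;

    /* Remaining fields are flag names with an optional "+" or "-". */
    for (tok = strtok(p, " \t\n"); tok != NULL; tok = strtok(NULL, " \t\n")) {
        int reset = (*tok == '-');

        if (*tok == '-' || *tok == '+')
            tok++;
        for (i = 0; i < nflags; i++)
            if (strcmp(tok, flag_table[i].name) == 0)
                break;
        if (i == nflags)
            return -1;
        if (reset)
            *flags &= ~flag_table[i].bit;
        else
            *flags |= flag_table[i].bit;
    }
    return 1;
}
```

Starting from all flags reset, the two "DESCRIPTION" lines of figure 6 leave PROCESS, EMITTEXT, EMITUL and EMITFILL set, and the "-REGION" field has no visible effect because that flag was never set.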
6. MISCELLANEOUS
6.1. Usage
Dex is invoked with a command line of the form:
dex [-dhtv] [-r rcfile] file1 file2 file3 ...
d => enable debug mode
h => print internal help message
t => enable trace mode
v => enable verbose mode
r => use reconfiguration file <rcfile>
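A command line of this shape maps directly onto the POSIX getopt() routine. The sketch below shows one way the switches could be collected; it is not taken from the dex sources, and the structure and function names are hypothetical.

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <stddef.h>
#include <unistd.h>

/* Option record; field names are illustrative, not taken from dex. */
struct dex_opts {
    int debug, help, trace, verbose;
    const char *rcfile;      /* NULL means "use the default .dexrc" */
};

/* Parse "dex [-dhtv] [-r rcfile] file1 file2 ...".
 * Returns the index of the first file argument, or -1 on a bad option. */
int parse_dex_args(int argc, char *argv[], struct dex_opts *o)
{
    int c;

    assert(argv != NULL && o != NULL);
    o->debug = o->help = o->trace = o->verbose = 0;
    o->rcfile = NULL;

    optind = 1;                          /* restart scanning on each call */
    while ((c = getopt(argc, argv, "dhtvr:")) != -1) {
        switch (c) {
        case 'd': o->debug = 1;       break;
        case 'h': o->help = 1;        break;
        case 't': o->trace = 1;       break;
        case 'v': o->verbose = 1;     break;
        case 'r': o->rcfile = optarg; break;
        default:  return -1;         /* getopt has already reported it */
        }
    }
    return optind;
}
```

The value returned is the position in argv of file1, so the caller can loop from there to argc over the files to be processed.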
6.2. Style Notes
The following order is suggested for topics within a
documentation region. It roughly follows the order which appears in
the Unix Programmer's Manual.
o FUNCTION or FILE or TOOL or NAME
o KEY WORDS
o SYNOPSIS
o DESCRIPTION
o RETURNS
o EXAMPLE
o FILES
o SEE ALSO
o DIAGNOSTICS
o WARNINGS or CAVEATS or RESTRICTIONS or BUGS
o AUTHOR
o PSEUDO CODE